Conversation

@VeckoTheGecko
Contributor

This PR supersedes #2447, implementing the following additional functionality (a minimal sketch of the resulting workflow follows the list):

  • Refactor the FieldSet.from_nemo() functionality out into a convert.py module with a nemo_to_sgrid() function, and add corresponding tests.
  • Make the input parameters of the nemo_to_sgrid() function align more closely with the input datasets.
  • Update FieldSet.from_sgrid_conventions() to infer the mesh from the metadata included in the dataset.
  • Add a get_value_by_id() function to the Grid*DMetadata classes, making it easy to refer to parts of the SGRID metadata model.
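
A minimal sketch of the resulting workflow, using the NemoNorthSeaORCA025 example dataset (this follows the same pattern as the tutorial snippet further down this thread):

import xarray as xr

import parcels

data_folder = parcels.download_example_dataset("NemoNorthSeaORCA025-N006_data")
ds_fields = xr.open_mfdataset(
    data_folder.glob("ORCA*.nc"),
    data_vars="minimal",
    coords="minimal",
    compat="override",
)
ds_coords = xr.open_dataset(data_folder / "coordinates.nc", decode_times=False)

# Keyword names map onto the NEMO output variables; the mesh is inferred
# from the dataset metadata rather than passed in explicitly.
ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["uo"], V=ds_fields["vo"], coords=ds_coords)
fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)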

Future work:

  • Improve logging and metadata conversion in convert.py. It's important that we refactor this module so that it's clearer, and add logging so that changes to the input dataset are communicated to the user. To support this, we need:
  • Better testing against different NEMO datasets
    • Reach out to the NEMO community, asking for ncdump output so that we can better see the types of NEMO models out there

I'll make an issue to track this future work.

erikvansebille and others added 28 commits January 9, 2026 16:24
And update of tutorial_nemo_curvilinear
And updating tutorial
And expanding comments on loading in the dataset
To use same ordering as in XLinear
This requires smaller selection_dict for the isel, so hopefully faster code
And also separating offset calculation into its own helper function
It's not clear why this is here, nor why removing it causes a test failure. To be investigated another time.

EDIT: This was introduced in
Parcels-code@c311fba
- though we're investigating if this can be implemented another way
  since there should be no particular difference with NEMO
Gradually reducing the dependency on the `mesh` param
Updates the API for conversion to be more closely aligned with the input data. Also handles the U and V fields separately - correctly assigning the dimension naming before merging into a single dataset.
Avoid assuming there's a U and V field. Maybe this should be refactored later...
@VeckoTheGecko force-pushed the fieldset_from_nemo-nick branch from 6d12f21 to 0b9c787 on January 9, 2026 15:25
@VeckoTheGecko
Contributor Author

I'll do a full self-review next week. Feel free to take a look in the meantime, @erikvansebille, if you'd like.

@VeckoTheGecko
Contributor Author

Also still a few failing tests that I need to sort out...

@erikvansebille
Member

Looks nice already. Some first comments below.

@erikvansebille
Member

This and the other tutorials still need to be updated to use the new convert workflow, right? Do you want me to do that? It could be a good test case for how user-friendly the convert functions are.

@VeckoTheGecko
Contributor Author
Jan 12, 2026

Yes, they would need to be updated.

Do you want me to do that? It could be a good test case for how user-friendly the convert functions are.

That would be great!

@erikvansebille
Member

I've tried to change the netcdf ingestion in the Nemo3D tutorial, but I get the error below.

Can you check what's going on, @VeckoTheGecko?

data_folder = parcels.download_example_dataset("NemoNorthSeaORCA025-N006_data")
ds_fields = xr.open_mfdataset(
    data_folder.glob("ORCA*.nc"),
    data_vars="minimal",
    coords="minimal",
    compat="override",
)
ds_coords = xr.open_dataset(data_folder / "coordinates.nc", decode_times=False)
ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["uo"], V=ds_fields["vo"], W=ds_fields["wo"], coords=ds_coords)
fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)
---------------------------------------------------------------------------
CoordinateValidationError                 Traceback (most recent call last)
Cell In[7], line 9
      2 ds_fields = xr.open_mfdataset(
      3     data_folder.glob("ORCA*.nc"),
      4     data_vars="minimal",
      5     coords="minimal",
      6     compat="override",
      7 )
      8 ds_coords = xr.open_dataset(data_folder / "coordinates.nc", decode_times=False)
----> 9 ds_fset = parcels.convert.nemo_to_sgrid(U=ds_fields["uo"], V=ds_fields["vo"], coords=ds_coords)
     10 fieldset = parcels.FieldSet.from_sgrid_conventions(ds_fset)

File ~/Codes/ParcelsCode/src/parcels/convert.py:224, in nemo_to_sgrid(coords, **fields)
    222 ds = _discover_U_and_V(ds, _NEMO_CF_STANDARD_NAME_FALLBACKS)
    223 ds = _maybe_create_depth_dim(ds)
--> 224 ds = _maybe_bring_UV_depths_to_depth(ds)
    225 ds = _drop_unused_dimensions_and_coords(ds, _NEMO_DIMENSION_COORD_NAMES)
    226 ds = _assign_dims_as_coords(ds, _NEMO_DIMENSION_COORD_NAMES)

File ~/Codes/ParcelsCode/src/parcels/convert.py:58, in _maybe_bring_UV_depths_to_depth(ds)
     56 def _maybe_bring_UV_depths_to_depth(ds):
     57     if "U" in ds.variables and "depthu" in ds.U.coords and "depth" in ds.coords:
---> 58         ds["U"] = ds["U"].assign_coords(depthu=ds["depth"].values).rename({"depthu": "depth"})
     59     if "V" in ds.variables and "depthv" in ds.V.coords and "depth" in ds.coords:
     60         ds["V"] = ds["V"].assign_coords(depthv=ds["depth"].values).rename({"depthv": "depth"})

File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/common.py:664, in DataWithCoords.assign_coords(self, coords, **coords_kwargs)
    661 else:
    662     results = self._calc_assign_results(coords_combined)
--> 664 data.coords.update(results)
    665 return data

File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:636, in Coordinates.update(self, other)
    630 # special case for PandasMultiIndex: updating only its dimension coordinate
    631 # is still allowed but depreciated.
    632 # It is the only case where we need to actually drop coordinates here (multi-index levels)
    633 # TODO: remove when removing PandasMultiIndex's dimension coordinate.
    634 self._drop_coords(self._names - coords_to_align._names)
--> 636 self._update_coords(coords, indexes)

File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:1104, in DataArrayCoordinates._update_coords(self, coords, indexes)
   1101 def _update_coords(
   1102     self, coords: dict[Hashable, Variable], indexes: dict[Hashable, Index]
   1103 ) -> None:
-> 1104     validate_dataarray_coords(
   1105         self._data.shape, Coordinates._construct_direct(coords, indexes), self.dims
   1106     )
   1108     self._data._coords = coords
   1109     self._data._indexes = indexes

File ~/Codes/ParcelsCode/.pixi/envs/docs/lib/python3.14/site-packages/xarray/core/coordinates.py:1328, in validate_dataarray_coords(shape, coords, dim)
   1326 for d, s in v.sizes.items():
   1327     if d in sizes and s != sizes[d]:
-> 1328         raise CoordinateValidationError(
   1329             f"conflicting sizes for dimension {d!r}: "
   1330             f"length {sizes[d]} on the data but length {s} on "
   1331             f"coordinate {k!r}"
   1332         )

CoordinateValidationError: conflicting sizes for dimension 'depthu': length 75 on the data but length 1 on coordinate 'depthu'
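
For context, the error reduces to xarray rejecting a coordinate whose length differs from the size of the dimension it is assigned to. A minimal sketch of the same failure mode, with dummy data and nothing NEMO-specific:

import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros(75), dims="depthu")
try:
    # Assigning a length-1 coordinate to the length-75 "depthu" dimension fails.
    da = da.assign_coords(depthu=np.zeros(1))
except ValueError as err:  # CoordinateValidationError derives from ValueError
    print(err)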

@VeckoTheGecko
Contributor Author

I was trying to reproduce this on the latest version of this branch and couldn't get an error. I get this as ds_fset:

(Pdb) p ds_fset
<xarray.Dataset> Size: 164MB
Dimensions:   (time: 6, depth: 75, y_center: 201, x: 151, y: 201, x_center: 151)
Coordinates:
  * time      (time) datetime64[ns] 48B 2000-01-02T12:00:00 ... 2000-01-27T12...
  * depth     (depth) float32 300B 0.0 1.024 2.103 ... 5.596e+03 5.8e+03
  * y_center  (y_center) int64 2kB 0 1 2 3 4 5 6 ... 194 195 196 197 198 199 200
  * x         (x) int64 1kB 0 1 2 3 4 5 6 7 ... 143 144 145 146 147 148 149 150
  * y         (y) int64 2kB 0 1 2 3 4 5 6 7 ... 193 194 195 196 197 198 199 200
  * x_center  (x_center) int64 1kB 0 1 2 3 4 5 6 ... 144 145 146 147 148 149 150
    lat       (y, x) float64 243kB ...
    lon       (y, x) float64 243kB ...
Data variables:
    U         (time, depth, y_center, x) float32 55MB dask.array<chunksize=(1, 1, 171, 151), meta=np.ndarray>
    V         (time, depth, y, x_center) float32 55MB dask.array<chunksize=(1, 1, 171, 151), meta=np.ndarray>
    W         (time, depth, y, x) float32 55MB dask.array<chunksize=(1, 1, 171, 151), meta=np.ndarray>
    grid      int64 8B 0
Attributes:
    long_name:           sea_water_x_velocity
    units:               m/s
    online_operation:    average
    interval_operation:  1440 s
    interval_write:      5 d
    cell_methods:        time: mean (interval: 1440 s)

Is this still relevant, do you think?

VeckoTheGecko and others added 5 commits January 12, 2026 10:19
Co-authored-by: Erik van Sebille <[email protected]>
Avoiding a FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
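
In practice that swap looks like this (a minimal sketch with a dummy dataset):

import numpy as np
import xarray as xr

ds = xr.Dataset({"U": (("depth", "y"), np.zeros((75, 10)))})
n_depth = ds.sizes["depth"]  # Dataset.sizes: mapping from dimension name to length
# n_depth = ds.dims["depth"]  # would emit the FutureWarning quoted above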
@VeckoTheGecko
Contributor Author

I'm getting failures in the unit tests due to 26c6d55. How should we go about choosing the interpolators, @erikvansebille? (And is this blocked by #2461? Should we fix this in a separate PR?)

Fixed in f726363

@VeckoTheGecko
Contributor Author

I'm happy with the state of this; happy to merge and iterate in future PRs. @erikvansebille, let me know if you have more feedback before merging.
